
Refactored retry config into _retry.py and added support for exponential backoff and Retry-After header #871

Open
jonathanedey wants to merge 3 commits into base: fcm-http2

Conversation

@jonathanedey (Contributor) commented Apr 14, 2025

Adds more retry logic support

This PR covers the following (a rough sketch of the resulting retry flow follows the list):

  • Moving and refactoring HttpxRetryTransport to _retry.py
  • Adding new class HttpxRetry to manage retry state
  • Adding support for setting some retry configurations on HttpxRetryTransport creation
  • Performing exponential backoff before request retries
  • Respecting Retry-After headers from request responses
  • Unit tests for HttpxRetry and HttpxRetryTransport
  • Unit test for handling HTTPX request errors
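
Rough sketch of how these pieces could fit together (illustrative only; the helper names and exact signatures below are approximations, not the final code in this PR):

```python
# Illustrative sketch, not the PR's actual code: an httpx transport that retries
# configured status codes, waits with exponential backoff, and prefers a
# server-supplied Retry-After delay when one is present.
import asyncio
import copy
import email.utils
import time

import httpx


class HttpxRetry:
    """Tracks retry state for a single logical request."""

    def __init__(self, max_retries=4, status_forcelist=(429, 500, 502, 503),
                 backoff_factor=0.5, backoff_max=60.0):
        self.max_retries = max_retries
        self.status_forcelist = status_forcelist
        self.backoff_factor = backoff_factor
        self.backoff_max = backoff_max
        self.history = []

    def retries_left(self):
        return len(self.history) < self.max_retries

    def is_retryable_response(self, response):
        return response.status_code in self.status_forcelist

    def get_backoff_time(self):
        # urllib3-style backoff: 0 for the first retry, then
        # backoff_factor * 2^(attempts - 1), capped at backoff_max.
        attempts = len(self.history)
        if attempts <= 1:
            return 0.0
        return min(self.backoff_max, self.backoff_factor * (2 ** (attempts - 1)))

    def get_retry_after(self, response):
        # Retry-After may be delay-seconds or an HTTP-date.
        value = response.headers.get('Retry-After')
        if value is None:
            return None
        if value.isdigit():
            return float(value)
        when = email.utils.parsedate_to_datetime(value)
        return max(0.0, when.timestamp() - time.time())

    def increment(self, request, response=None, error=None):
        self.history.append((request, response, error))


class HttpxRetryTransport(httpx.AsyncHTTPTransport):
    """Wraps httpx's async transport with the retry policy above."""

    def __init__(self, retry=None, **kwargs):
        super().__init__(**kwargs)
        self._retry = retry or HttpxRetry()

    async def handle_async_request(self, request):
        retry = copy.deepcopy(self._retry)  # fresh retry state per request
        while True:
            response = await super().handle_async_request(request)
            if not (retry.retries_left() and retry.is_retryable_response(response)):
                return response
            retry.increment(request, response)
            await response.aclose()  # discard the retryable response
            delay = retry.get_retry_after(response) or retry.get_backoff_time()
            await asyncio.sleep(delay)
```

The actual implementation may structure this differently (e.g. a separate _dispatch_with_retry helper), but the overall flow is roughly the above.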

@jonathanedey added the release:stage (Stage a release candidate) label on Apr 14, 2025
@lahirumaramba (Member) left a comment

Thanks Jonathan! Overall looking good! Added a few comments!

class HttpxRetry:
    """HTTPX based retry config"""
    # TODO: Decide
    # urllib3.Retry ignores the status_forcelist when respecting Retry-After header

lahirumaramba (Member):

Does this mean urllib3.Retry retries other status codes (that are not in status_forcelist) when they come with Retry-After headers?
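
For context, urllib3 appears to do this only for statuses in Retry.RETRY_AFTER_STATUS_CODES ({413, 429, 503}). A quick check using its public is_retry helper (worth verifying against the installed urllib3 version):

```python
from urllib3.util.retry import Retry

retry = Retry(total=3, status_forcelist=[500, 502])

# 503 is not in status_forcelist, but it is in Retry.RETRY_AFTER_STATUS_CODES,
# so urllib3 treats it as retryable when a Retry-After header is present.
print(bool(retry.is_retry('GET', 503, has_retry_after=True)))   # True
print(bool(retry.is_retry('GET', 503, has_retry_after=False)))  # False
```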

return False

# Placeholder for exception retrying
def is_retryable_error(self, error: Exception):

lahirumaramba (Member):

It is best not to leave placeholders in PRs if we can avoid it. If this is currently a no-op, we can remove it and add it back when we need it.

self.history.append((request, response, error))


# TODO: Remove comments

lahirumaramba (Member):

You can remove these, or add a comment on what it doesn't cover as a TODO for the future. I prefer we use a bug/issue to track these instead of TODO comments when possible.


def __init__(
self,
status: int = 10,

lahirumaramba (Member):

Would it make sense to call this max_retries instead?

response, error = None, None

try:
    logger.debug('Sending request in _dispatch_with_retry(): %r', request)

lahirumaramba (Member):

Do we plan to keep these logs in the production code?

if error and not retry.is_retryable_error(error):
    raise error

retry.increment(request, response)

lahirumaramba (Member):

Should we pass the error here? retry.increment(request, response, error)
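
If increment grows an optional error parameter, a possible shape (hypothetical, but consistent with the history tuple quoted above):

```python
def increment(self, request, response=None, error=None):
    # Record the attempt: response is set for retryable HTTP statuses,
    # error for transport-level failures.
    self.history.append((request, response, error))
```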

return response
if error:
    raise error
raise Exception('_dispatch_with_retry() ended with no response or exception')

lahirumaramba (Member):

Let's narrow this down to a more specific exception type; RuntimeError or AssertionError might work better here.


logger = logging.getLogger(__name__)

lahirumaramba (Member):

Is there a way to set the log level in production code?
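
Since these go through the standard logging module, applications can already tune them; the module path below is an assumption based on the new file name:

```python
import logging

# Hide the SDK's retry debug lines in production...
logging.getLogger('firebase_admin._retry').setLevel(logging.WARNING)

# ...or surface them while troubleshooting.
logging.basicConfig(level=logging.DEBUG)
```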

# Simulate attempt 6 completed
retry.increment(self._TEST_REQUEST, response)
# History len 6, attempt 7 -> 0.5*(2^4) = 10.0
assert retry.get_backoff_time() == pytest.approx(10.0)

lahirumaramba (Member):

Should this be 0.5 * (2^5) = 16.0, clamped at 10 due to backoff_max? If that's the case let's clarify it here :)
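
Worked numbers, assuming a urllib3-style formula (backoff = backoff_factor * 2^(len(history) - 1), capped at backoff_max) and the 0.5 / 10.0 values implied by the test:

```python
backoff_factor, backoff_max = 0.5, 10.0
history_len = 6                                # six attempts recorded so far
raw = backoff_factor * 2 ** (history_len - 1)  # 0.5 * 32 = 16.0
backoff = min(backoff_max, raw)                # clamped to 10.0
```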

self.status_forcelist = status_forcelist
self.backoff_factor = backoff_factor
self.backoff_max = backoff_max
self.raise_on_status = raise_on_status

lahirumaramba (Member):

If raise_on_status is unused, we should just remove it.

Labels: release:stage (Stage a release candidate)
Projects: None yet
2 participants